Machine Learning Friendly Set Version of Johnson-Lindenstrauss Lemma

Author

  • Mieczyslaw A. Klopotek
Abstract

In this paper we make a novel use of the Johnson-Lindenstrauss Lemma. The Lemma has an existential form: it says that there exists a JL transformation f of the data points into a lower dimensional space such that all of them fall into a predefined error range δ. We formulate a theorem stating that we can choose the target dimensionality in a random projection type JL linear transformation in such a way that, with probability 1−ε, all of them fall into the predefined error range δ, for any user-predefined failure probability ε. This result is important for applications such as data clustering, where we want to have an a priori dimensionality reducing transformation instead of trying out a (large) number of them, as with the traditional Johnson-Lindenstrauss Lemma. In particular, we take a closer look at the k-means algorithm and prove that a good solution in the projected space is also a good solution in the original space. Furthermore, under proper assumptions, local optima in the original space are also local optima in the projected space. We also define conditions under which the clusterability property of the original space is transmitted to the projected space, so that special case algorithms for the original space are also applicable in the projected space.
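To make the idea of fixing the target dimensionality in advance concrete, here is a minimal sketch. It does not reproduce the theorem of the paper. It assumes the standard textbook union-bound argument for a Gaussian random projection, in which one pair of points has its squared distance distorted by more than a factor 1 ± δ with probability at most 2·exp(−k(δ²/2 − δ³/3)/2). The function names `jl_target_dim`, `random_projection`, and `max_sq_distortion` are illustrative and not taken from the source.

```python
import math
import random


def jl_target_dim(m, delta, eps):
    """Smallest k with m*(m-1)*exp(-k*(delta^2/2 - delta^3/3)/2) <= eps.

    Assumption: textbook union bound over all m*(m-1)/2 pairs,
    not the bound stated in the paper.
    """
    if m < 2 or not (0 < delta < 1) or not (0 < eps < 1):
        raise ValueError("need m >= 2, 0 < delta < 1, 0 < eps < 1")
    rate = delta ** 2 / 2 - delta ** 3 / 3
    return math.ceil(2 * math.log(m * (m - 1) / eps) / rate)


def random_projection(points, k, rng):
    """Project points with a k x n matrix of N(0, 1/k) entries."""
    n = len(points[0])
    scale = 1 / math.sqrt(k)
    matrix = [[rng.gauss(0, 1) * scale for _ in range(n)] for _ in range(k)]
    return [[sum(r * x for r, x in zip(row, p)) for row in matrix]
            for p in points]


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def max_sq_distortion(orig, proj):
    """Largest relative error of squared pairwise distances."""
    worst = 0.0
    for i in range(len(orig)):
        for j in range(i + 1, len(orig)):
            ratio = sq_dist(proj[i], proj[j]) / sq_dist(orig[i], orig[j])
            worst = max(worst, abs(ratio - 1))
    return worst


if __name__ == "__main__":
    rng = random.Random(0)
    m, n, delta, eps = 20, 200, 0.5, 0.05
    data = [[rng.gauss(0, 1) for _ in range(n)] for _ in range(m)]
    k = jl_target_dim(m, delta, eps)
    projected = random_projection(data, k, rng)
    print(k, max_sq_distortion(data, projected))
```

Under this assumed bound, k depends only on the number of points m, the error range δ, and the failure probability ε, not on the original dimensionality, so one projection can be drawn once rather than retried. For small data sets the resulting k may exceed the original dimension, in which case projecting brings no reduction.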


Similar resources

236779: Foundations of Algorithms for Massive Datasets, Lecture 4: The Johnson-Lindenstrauss Lemma

The Johnson-Lindenstrauss lemma and its proof. This lecture aims to prove the Johnson-Lindenstrauss lemma. Since the lemma is proved easily with the help of another interesting lemma, part of this lecture focuses on the proof of that second lemma. At the end, the optimality of the Johnson-Lindenstrauss lemma is discussed. Lemma 1 (Johnson-Lindenstrauss). Given the initial space X ⊆ R^n s.t. |X| = N, <...


An Elementary Proof of the Johnson-Lindenstrauss Lemma

The Johnson-Lindenstrauss lemma shows that a set of n points in high dimensional Euclidean space can be mapped down into an O(log n / ε²) dimensional Euclidean space such that the distance between any two points changes by only a factor of (1 ± ε). In this note, we prove this lemma using elementary probabilistic techniques.


Geometric Optimization, April 12, 2007, Lecture 25: Johnson-Lindenstrauss Lemma

The topic of this lecture is dimensionality reduction. Many problems have been solved efficiently in low dimensions, but very often the solutions for low-dimensional spaces are impractical for high dimensional spaces because either space or running time is exponential in the dimension. In order to address the curse of dimensionality, one technique is to map a set of points in a high dimensional space...


On variants of the Johnson-Lindenstrauss lemma

The Johnson–Lindenstrauss lemma asserts that an n-point set in any Euclidean space can be mapped to a Euclidean space of dimension k = O(ε⁻² log n) so that all distances are preserved up to a multiplicative factor between 1 − ε and 1 + ε. Known proofs obtain such a mapping as a linear map R^n → R^k with a suitable random matrix. We give a simple and self-contained proof of a version of the Johnso...


Lecture 6: Johnson-Lindenstrauss Lemma: Dimension Reduction

Observe that for any three points, if the three distances between them are given, then the three angles are fixed. Given n−1 vectors, the vectors together with the origin form a set of n points. In fact, given any n points in Euclidean space (in n−1 dimensions), the Johnson-Lindenstrauss Lemma states that the n points can be placed in O(log n / ε²) dimensions such that distances are preserved wi...



Journal:
  • CoRR

Volume: abs/1703.01507

Publication date: 2017